The Effectiveness of Satisfying the Assumptions of Predictive Modelling Techniques: An Exercise in Predicting the FIFA World Cup 2006
Abstract
The assumptions of statistical procedures are enforced more rigorously in some disciplines than in others. Outliers are often removed from data sets due to concerns over measurement error. However, when predicting the outcomes of sports performances, such outliers represent real and valid performances, such as Germany’s 8-0 win over Saudi Arabia in the 2002 FIFA World Cup. Previous research into the accuracy of predictive modelling techniques has provided examples where the accuracy of models based on data that violate the relevant assumptions is greater than that of models where the assumptions were satisfied. The purpose of this investigation was to intentionally develop two sets of 6 models: one set based on untransformed data that violated the assumptions of the modelling techniques, and a second set where the data were transformed and outliers were removed in order to satisfy those assumptions. Data from 477 pool matches and 165 knockout matches from World Cups, European Championships, Copa America tournaments and African Cup of Nations tournaments from May 1994 to February 2006 were used to produce predictive models of match outcomes (win, draw or lose) or goal difference with respect to the higher ranked teams within matches according to the FIFA World rankings. The independent variables used were the difference between the teams' FIFA World rankings, the difference between the teams' distances from their capital city to the capital city of the host nation, and the difference in recovery days from the previous match within the tournament. The two sets of models were used to predict the 2006 FIFA World Cup; 22 human predictions and 20 weighted random predictions were also produced. An evaluation process marked the predictions against the actual outcomes of matches in the 2006 FIFA World Cup, out of a total possible score of 64 points.
The mean accuracy of the models where the assumptions were satisfied was 38.67 points, which was similar to the 39.00 points for those where the assumptions were violated. However, the best individual model was a simulator where the assumptions of the underlying multiple linear regression technique were satisfied (44.00 points). The multiple linear regression-based models were more accurate than those based on discriminant function analysis and binary logistic regression. The accuracy score of the 12 model-based predictions (38.83 ± 3.26) was significantly lower than that of the human predictions (42.95 ± 3.36; P < 0.017) but significantly greater than that of the weighted random predictions (31.05 ± 3.86; P < 0.017). These results provide evidence that challenges the value of satisfying the assumptions of discriminant function analysis, binary logistic regression and multiple linear regression.
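The following is a minimal sketch of how the three independent variables and a goal-difference prediction might be turned into a match outcome and then scored. It is not the author's implementation. The names `Team`, `match_features`, `predict_goal_difference`, `outcome_from_goal_difference` and `score_predictions` are hypothetical, as are the sign convention (higher ranked team minus lower ranked team), the `draw_band` threshold, the one-point-per-correct-match scoring, and every coefficient value in the usage lines; none of these come from the paper.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass
class Team:
    """Hypothetical container for one team's inputs before a match."""
    fifa_rank: int          # FIFA World ranking (1 is best)
    distance_km: float      # own capital city to the host nation's capital
    recovery_days: int      # days since the previous match in the tournament


def match_features(higher: Team, lower: Team) -> Tuple[int, float, int]:
    """Differences taken as higher ranked team minus lower ranked team.

    The direction of subtraction is an assumption made for this sketch.
    """
    return (
        higher.fifa_rank - lower.fifa_rank,
        higher.distance_km - lower.distance_km,
        higher.recovery_days - lower.recovery_days,
    )


def predict_goal_difference(
    features: Sequence[float], intercept: float, coefs: Sequence[float]
) -> float:
    """Linear predictor: intercept plus the weighted sum of the features."""
    if len(features) != len(coefs):
        raise ValueError("features and coefs must have the same length")
    return intercept + sum(c * x for c, x in zip(coefs, features))


def outcome_from_goal_difference(goal_diff: float, draw_band: float = 0.5) -> str:
    """Map a predicted goal difference to win, draw or lose.

    The outcome is from the higher ranked team's point of view; the
    draw_band of 0.5 goals is an illustrative assumption.
    """
    if goal_diff > draw_band:
        return "win"
    if goal_diff < -draw_band:
        return "lose"
    return "draw"


def score_predictions(predicted: Sequence[str], actual: Sequence[str]) -> int:
    """Count correct outcomes, assuming one point per correctly predicted match."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    return sum(p == a for p, a in zip(predicted, actual))


# Usage with made-up inputs and made-up coefficients (illustration only).
features = match_features(Team(5, 0.0, 4), Team(20, 9000.0, 3))
goal_diff = predict_goal_difference(features, intercept=0.5, coefs=(-0.05, 0.0, 0.1))
print(features, outcome_from_goal_difference(goal_diff))
```

In a sketch like this, the coefficients would be estimated from the historical matches rather than typed in by hand, and the real evaluation scheme may award points differently from the simple count shown here.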
Similar resources
The Effectiveness of Satisfying the Assumptions of Predictive Modelling Techniques: An Exercise in Predicting the FIFA World Cup 2010
The assumptions of statistical procedures are enforced more rigorously in some disciplines than in others. Previous research into the accuracy of predictive modelling techniques has provided examples where the accuracy of models based on data that violate the relevant assumptions is greater than that of models where the assumptions were satisfied. The purpose of this investigation was to develop two sets of 4 ...
The role of information technology and perceived organizational support on the performance of the referees of the 2018 FIFA World Cup in Russia
Background and Aim: Given the widespread trade in some sports, a misjudging decision can not only have huge financial consequences for competitors, but also affect the event itself. The overall purpose of this study was to determine the role of information technology and perceived organizational support on the performance of the referees of the 2018 FIFA World Cup in Russia: to investigate the ...
Factors Influencing the Accuracy of Predictions of the 2014 FIFA World Cup
The purpose of this paper was to compare the accuracy of different simulation models of the 2014 FIFA World Cup. There were 12 (2 x 3 x 2) models altogether (2 data sets of previous matches, 3 sets of variables and models where the data either satisfied the assumptions of linear regression or not). One set of previous data consisted of 440 matches from all international tournaments played since...
The Assumptions Strike Back! A Comparison of Prediction Models for the 2011 Rugby World Cup
Four studies out of a series of 6 previous studies have found that predictive models are more accurate at predicting actual match outcomes when the modelling assumptions are violated than when data are transformed to satisfy the assumptions. The current investigation produced two sets of two predictive models of the 2011 Rugby World Cup; one set of models used raw independent variables that vio...
Ranking World Cup 2014 Football Matches by Data Envelopment Analysis Models with Common Weights
Football is one of the most popular and exciting sports throughout the world. Today, in addition to the result, the number of goals and points, the attraction and quality of the played matches are important for club management staff, coaching staff, the players and especially the fans. Besides the number of goals, there are different criteria such as successful passes, attacks, defenses, tackles ...
Journal title:
- Int. J. Comp. Sci. Sport
Volume 5
Publication date: 2006